- LNO(5) Last changed: 4-9-99
-
-
- NAME
- LNO - Compiler loop nest optimization option group
-
- SYNOPSIS
- -LNO: ...
-
- IMPLEMENTATION
- IRIX systems
-
- DESCRIPTION
- This man page describes the loop nest optimization options accepted by
- the f90(1), f77(1), CC(1), cc(1), and c89(1) commands.
-
- The -LNO: option group specifies options and transformations performed
- on loop nests. The -LNO: option group is enabled only if the -O3
- option is also specified on the compiler command line.
-
- For information on the LNO options that are in effect during a
- compilation, use the -LIST:options=ON option.
-
- You can specify more than one suboption to the -LNO: option either by
- using colons to separate each suboption or by specifying multiple
- options on the command line. For example, the following command lines
- are equivalent:
-
- f90 -LNO:auto_dist=ON:outer=OFF b.f
- f90 -LNO:auto_dist=ON -LNO:outer=OFF b.f
-
- Some -LNO: suboptions are specified with a setting that either enables
- or disables the feature. To enable a feature, specify the suboption
- either alone or with =1, =ON, or =TRUE. To disable a feature, specify
- the suboption with either =0, =OFF, or =FALSE. For example, the
- following command lines are equivalent:
-
- f90 -LNO:auto_dist:blocking=OFF:oinvar=FALSE a.f
- f90 -LNO:auto_dist=1:blocking=0:oinvar=OFF a.f
-
- For brevity, this man page shows only the ON or OFF settings to
- suboptions, but 0, 1, TRUE, and FALSE are also allowed as settings. In
- addition, this man page shows the abbreviated form for some of the
- suboption names. You can use either the abbreviation or the complete
- suboption name when using the suboptions. The following is a list of
- the abbreviations and the complete suboption names:
-
- Complete name             Abbreviation
-
- outer_unroll              ou
- associativity             assoc
- clean_miss_penalty        cmp
- dirty_miss_penalty        dmp
- cache_size                cs
- is_memory_level           is_mem
- line_size                 ls
- tlb_entries               tlb
- tlb_clean_miss_penalty    tlbcmp
- prefetch_level            pf
-
- See "F77 LNO Directives" at the end of this man page for a summary of
- the F77 directives for LNO. See the MIPSpro 7 Fortran 90 Commands and
- Directives Reference Manual for a discussion of the Fortran 90 LNO
- directives. See MIPSpro C and C++ Pragmas for descriptions of the C
- and C++ LNO #pragma directives.
-
- The descriptions of the suboptions to -LNO: are divided into the
- following categories:
-
- * General options
-
- * Transformation options
-
- * Cache memory management options
-
- * TLB options
-
- * Prefetch options
-
- The -LNO: option accepts the following general suboptions:
-
- Suboption Action
-
- auto_dist[ = ( ON|OFF )]
- Distributes local arrays in common blocks that are
- accessed in parallel. The default is OFF.
-
- This optimization works with either automatic parallelism
- or parallelism using directives. It is always safe, does
- not affect the layout of arrays in virtual space, and does
- not incur addressing overhead.
-
- fission=n Controls loop fission. n can be one of the following:
-
- 0 Disables loop fission.
-
- 1 Performs normal fission as necessary. This is the
- default.
-
- 2 Specifies that fission be tried before fusion.
-
- If -LNO:fission=n and -LNO:fusion=n are both set to 1 or
- to 2, fusion is performed.
-
- fusion=n Controls loop fusion. n can be one of the following:
-
- 0 Disables loop fusion.
-
- 1 Performs standard outer loop fusion. This is the
- default.
-
- 2 Specifies that outer loops should be fused, even if it
- means partial fusion.
-
- The compiler attempts fusion before fission. The compiler
- performs partial fusion if not all levels can be fused in
- the multiple-level fusion.
-
- If -LNO:fission=n and -LNO:fusion=n are both set to 1 or
- to 2, fusion is performed.
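-
- As a rough, hand-written illustration of what fusion and fission mean
- (not compiler output; the function names are hypothetical), the two
- routines below compute the same result. The first is the fissioned
- form, the second the fused form.
-
```c
#include <stddef.h>

/* Fissioned form: two separate loops over the same range. */
void separate_loops(size_t n, const double *a, double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        b[i] = a[i] * 2.0;
    for (size_t i = 0; i < n; i++)
        c[i] = b[i] + 1.0;
}

/* Fused form: one pass, so b[i] is reused right after it is written. */
void fused_loop(size_t n, const double *a, double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        b[i] = a[i] * 2.0;
        c[i] = b[i] + 1.0;
    }
}
```
-
- Fusion tends to help when the loops share data; fission can help when
- a single loop body is too large or mixes unrelated work.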
-
- fusion_peeling_limit=n
- Sets the limit for the number of iterations allowed to be
- peeled in fusion, where n >= 0. By default, n=5.
-
- gather_scatter=n
- Performs gather-scatter optimizations. n can be one of
- the following:
-
- 0 Disables all gather-scatter optimization.
-
- 1 Performs gather-scatter optimizations on non-nested IF
- statements. This is the default.
-
- 2 Performs multi-level gather-scatter optimizations.
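-
- The general idea behind gather-scatter on a loop containing a single
- IF can be sketched by hand as follows. This is an assumption about the
- shape of the transformation, not the compiler's actual output; the
- function names and the idx scratch array are hypothetical.
-
```c
#include <stddef.h>

/* Original form: the conditional sits inside the loop body. */
void guarded(size_t n, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        if (x[i] > 0.0)
            y[i] = y[i] / x[i];
}

/* Gather-scatter form: collect the qualifying indices first, then run a
   branch-free loop over them. idx must have room for n entries. */
void gather_scatter(size_t n, const double *x, double *y, size_t *idx)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)        /* gather */
        if (x[i] > 0.0)
            idx[count++] = i;
    for (size_t k = 0; k < count; k++)    /* compute and scatter */
        y[idx[k]] = y[idx[k]] / x[idx[k]];
}
```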
-
- ignore_pragmas[ = ( ON|OFF )]
- Specifies that the command line options override
- directives in the source file. The default is OFF.
-
- local_pad_size=n
- Specifies the amount by which to pad local array
- dimensions. By default, the compiler automatically
- chooses the amount of padding to improve cache behavior
- for local array accesses.
-
- non_blocking_loads[ = ( ON|OFF )]
- (C/C++ and F77 only) Specifies whether the processor
- blocks on loads. If not set, the default of the current
- processor is used.
-
- oinvar[ = ( ON|OFF )]
- Controls outer loop hoisting. The default is ON.
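-
- This page does not define outer loop hoisting further. Assuming it
- refers to moving a computation that is invariant in an inner loop out
- to an enclosing loop, a hand-written sketch looks like this (function
- names are hypothetical):
-
```c
#include <stddef.h>

/* Before: scale[i] * bias is recomputed for every j, although it
   depends only on the outer index i. */
void before_hoist(size_t n, size_t m, const double *scale, double bias,
                  const double *in, double *out)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            out[i * m + j] = in[i * m + j] + scale[i] * bias;
}

/* After: the invariant product is computed once per outer iteration. */
void after_hoist(size_t n, size_t m, const double *scale, double bias,
                 const double *in, double *out)
{
    for (size_t i = 0; i < n; i++) {
        double t = scale[i] * bias;
        for (size_t j = 0; j < m; j++)
            out[i * m + j] = in[i * m + j] + t;
    }
}
```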
-
- opt=n Controls the LNO optimization level. n can be one of the
- following:
-
- 0 Disables nearly all loop nest optimization.
-
- 1 Performs full loop nest transformations. This is the
- default.
-
- outer[ = ( ON|OFF )]
- Enables or disables outer loop fusion. The default is ON.
-
- parallel_overhead=num_cycles
- Overrides internal compiler estimates concerning the
- efficiency to be gained by executing certain loops in
- parallel rather than serially. num_cycles specifies the
- number of processor cycles. Specify an integer for
- num_cycles. The default is 2600.
-
- pure=n (MIPSpro C/C++)
- Tells the compiler how to use the #pragma pure and #pragma
- no side effects directives when performing parallel
- analysis. A function declared with #pragma no side effects
- may read its arguments and unspecified global data; a
- function declared with #pragma pure can read only its
- arguments; neither kind of function can modify its
- arguments or global data. Specify 0, 1, or 2 for n, as
- follows:
-
- n Value Description
-
- 0 The compiler ignores the #pragma pure and
- #pragma no side effects directives when
- gathering information for parallelization
- analysis.
-
- 1 The compiler interprets the #pragma pure and
- #pragma no side effects directives per their
- definitions when gathering information for
- parallelization analysis.
-
- 2 The compiler interprets #pragma no side effects
- as #pragma pure when gathering information for
- parallelization analysis. This option is
- provided because you may declare a function to
- have no side effects when, in fact, it is pure
- except for references to system variables such
- as errno. In these cases, you can treat no side
- effects functions as if they were pure for the
- purposes of parallelization.
-
- pure=n (MIPSpro 7 Fortran 90)
- Specifies the extent to which the compiler should consider
- the effect of a PURE procedure or a !DIR$ NOSIDEEFFECTS
- directive when performing parallel analysis. Specify 0,
- 1, or 2 for n, as follows:
-
- n Value Description
-
- 0 Directs the compiler to ignore a PURE attribute
- and the !DIR$ NOSIDEEFFECTS directive.
-
- 1 Directs the compiler to consider the fact that
- PURE procedures and procedures preceded by a
- !DIR$ NOSIDEEFFECTS directive do not modify
- global data or procedure arguments when
- performing parallel analysis. This is the default.
-
- 2 Asserts to the compiler that PURE
- procedures and procedures preceded by a
- !DIR$ NOSIDEEFFECTS directive do not modify
- global data, do not modify procedure dummy
- arguments, and do not access global data.
-
- This setting asserts that the only non-local
- data items referenced by the procedure are the
- dummy arguments to the procedure. This is an
- extension of the Fortran standard meaning of
- PURE and of the meaning of !DIR$ NOSIDEEFFECTS.
- At this setting, more aggressive parallelization
- can occur if procedures are known not to access
- global data.
-
- vintr[ = ( ON|OFF )]
- Specifies that vectorizable versions of the math intrinsic
- functions should be used. The default is ON.
-
- For information on the math intrinsic functions, see
- math(3M).
-
- The loop transformation arguments allow you to control cache blocking,
- loop unrolling, and loop interchange. They are as follows:
-
- blocking[ = ( ON|OFF )]
- Specify blocking=OFF to disable the cache blocking
- transformation. The default is ON.
-
- blocking_size=[n1][,n2]
- Specifies a block size that the compiler must use when
- performing any blocking. When using the MIPSpro 7 Fortran
- 90 compiler, specify a value for n2 when using a 2-level
- cache. For n1 or n2, specify a positive integer number that
- represents the number of iterations.
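-
- To make "block size in iterations" concrete, here is a minimal
- hand-blocked matrix multiply. It is a sketch only: the size NMAT, the
- function names, and the choice of which loops to block are assumptions
- made for illustration, and the compiler may block differently.
-
```c
#include <stddef.h>

#define NMAT 8   /* hypothetical problem size, for illustration only */

/* Plain triple loop: c += a * b. */
void matmul_plain(double a[NMAT][NMAT], double b[NMAT][NMAT],
                  double c[NMAT][NMAT])
{
    for (size_t i = 0; i < NMAT; i++)
        for (size_t k = 0; k < NMAT; k++)
            for (size_t j = 0; j < NMAT; j++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Blocked form: the k and j loops are split into tiles of bs iterations
   so that a bs-by-bs tile of b is reused across all values of i. */
void matmul_blocked(double a[NMAT][NMAT], double b[NMAT][NMAT],
                    double c[NMAT][NMAT], size_t bs)
{
    for (size_t kk = 0; kk < NMAT; kk += bs)
        for (size_t jj = 0; jj < NMAT; jj += bs)
            for (size_t i = 0; i < NMAT; i++)
                for (size_t k = kk; k < kk + bs && k < NMAT; k++)
                    for (size_t j = jj; j < jj + bs && j < NMAT; j++)
                        c[i][j] += a[i][k] * b[k][j];
}
```
-
- The bs argument plays the role that n1 plays in blocking_size: it is a
- count of loop iterations, not a size in bytes.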
-
- interchange[ = ( ON|OFF )]
- Specifies whether or not loop interchange optimizations are
- performed. The default is ON.
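-
- Loop interchange swaps the order of two nested loops. A hand-written
- sketch in C, where arrays are stored row by row (the array shape and
- function names are hypothetical):
-
```c
#include <stddef.h>

#define ROWS 4
#define COLS 6

/* Inner loop over i: consecutive iterations are COLS elements apart. */
void scale_inner_i(double m[ROWS][COLS], double f)
{
    for (size_t j = 0; j < COLS; j++)
        for (size_t i = 0; i < ROWS; i++)
            m[i][j] *= f;
}

/* After interchange, inner loop over j: consecutive iterations touch
   adjacent elements, which is friendlier to the cache in C. */
void scale_inner_j(double m[ROWS][COLS], double f)
{
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            m[i][j] *= f;
}
```
-
- In Fortran, where arrays are stored column by column, the preferred
- order is the opposite one.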
-
- ou=n Indicates that all outer loops for which unrolling is legal
- should be unrolled by n, where n is a positive integer. The
- compiler unrolls loops by this amount or not at all.
-
- ou_deep[ = ( ON|OFF )]
- Specifies that for loops with 3-deep, or deeper, loop nests,
- the compiler should outer unroll the wind-down loops that
- result from outer unrolling loops further out. This results
- in large code size, but it generates much faster code
- whenever wind-down loop execution costs are important. The
- default is ON.
-
- ou_further=n
- Specifies whether or not the compiler performs outer loop
- unrolling on wind-down loops. Specify an integer for n.
-
- To disable this additional unrolling, specify
- -LNO:ou_further=999999. To enable it as much as is
- sensible, specify -LNO:ou_further=3.
-
- ou_max=n Indicates that the compiler can unroll as many as n copies
- per loop, but no more.
-
- ou_prod_max=n
- Indicates that the product of unrolling of the various outer
- loops in a given loop nest is not to exceed n, where n is a
- positive integer. The default is 16.
-
- pwr2[ = ( ON|OFF )]
- (C/C++ and F77 only) Specifies whether to ignore the leading
- dimension (set to OFF to ignore).
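-
- The ou suboptions above refer to outer loop unrolling and to wind-down
- loops. The sketch below unrolls an outer loop by 2 by hand and shows
- the wind-down loop that handles leftover iterations. It illustrates
- the technique only; the function names are hypothetical and the
- compiler's generated code will differ.
-
```c
#include <stddef.h>

/* Reference form: y[i] += sum over j of a[i*m + j] * x[j]. */
void matvec_plain(size_t n, size_t m, const double *a, const double *x,
                  double *y)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            y[i] += a[i * m + j] * x[j];
}

/* Outer loop unrolled by 2, with the two inner loops merged. */
void matvec_ou2(size_t n, size_t m, const double *a, const double *x,
                double *y)
{
    size_t i = 0;
    for (; i + 1 < n; i += 2)              /* two outer iterations per trip */
        for (size_t j = 0; j < m; j++) {
            y[i]     += a[i * m + j] * x[j];        /* x[j] read once... */
            y[i + 1] += a[(i + 1) * m + j] * x[j];  /* ...and used twice */
        }
    for (; i < n; i++)                     /* wind-down loop for leftovers */
        for (size_t j = 0; j < m; j++)
            y[i] += a[i * m + j] * x[j];
}
```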
-
- Certain arguments allow you to describe the target cache memory
- system. The numbering in the following arguments starts with the
- cache level closest to the processor and works outward:
-
- assoc1=n, assoc2=n, assoc3=n, assoc4=n
- Specifies the cache set associativity. For a fully
- associative cache, such as main memory, set n to any
- sufficiently large number, such as 128. Specify a positive
- integer for n. Specifying n=0 indicates that there is no
- cache at that level.
-
- cmp1=n, cmp2=n, cmp3=n, cmp4=n
- dmp1=n, dmp2=n, dmp3=n, dmp4=n
- Specifies, in processor cycles, the time for a clean miss
- (cmpx=) or dirty miss (dmpx=) to the next outer level of the
- memory hierarchy. This number is approximate because it
- depends upon a clean or dirty line, read or write miss, etc.
- Specify a positive integer for n. Specifying n=0 indicates
- that there is no cache at that level.
-
- cs1=n, cs2=n, cs3=n, cs4=n
- Specifies the cache size. The value n can be 0, or it can
- be a positive integer followed by one of the following
- letters: k, K, m, or M. This specifies the cache size in
- Kbytes or Mbytes. Specifying 0 indicates that there is no
- cache at that level.
-
- cs1 refers to the primary cache. cs2 refers to the
- secondary cache. cs3 refers to memory. cs4 refers to disk.
- The default cache size for each type of cache depends on
- your system. You can use the -LIST:options=ON option to see
- the default cache sizes used during your compilation. In
- addition, you can enter the following command to see the
- secondary cache size(s) on your system:
-
- hinv -c memory | grep Secondary
-
- is_mem1[ = ( ON|OFF )]
- is_mem2[ = ( ON|OFF )]
- is_mem3[ = ( ON|OFF )]
- is_mem4[ = ( ON|OFF )]
- Specifies that certain memory hierarchies should be modeled
- as memory, not cache. The default is OFF for each option.
-
- Blocking can be attempted for this memory hierarchy level,
- and blocking appropriate for memory, rather than cache, is
- applied. No prefetching is performed, and any prefetching
- options are ignored. If an -LNO:is_memx[ = ( ON|OFF )]
- option is specified, the corresponding assocx=n
- specification is ignored, and any cmpx=n and dmpx=n options
- on the command line are ignored.
-
- ls1=n, ls2=n, ls3=n, ls4=n
- Specifies the line size, in bytes. This is the number of
- bytes, specified in the form of a positive integer number,
- n, that are moved from the memory hierarchy level further
- out to this level on a miss. Specifying n=0 indicates that
- there is no cache at that level.
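-
- The size, line size, and associativity of a cache level are related
- by the usual formula sets = size / (line size * associativity). The
- helper functions below are hypothetical and not part of any compiler
- interface; they only show how the three values fit together.
-
```c
#include <stddef.h>

/* Number of sets in a set-associative cache. Returns 0 if any input is
   0, following the convention that 0 means "no cache at that level". */
size_t cache_sets(size_t size_bytes, size_t line_bytes, size_t assoc)
{
    if (size_bytes == 0 || line_bytes == 0 || assoc == 0)
        return 0;
    return size_bytes / (line_bytes * assoc);
}

/* Number of lines the cache can hold in total. */
size_t cache_lines(size_t size_bytes, size_t line_bytes)
{
    return line_bytes ? size_bytes / line_bytes : 0;
}
```
-
- For example, a description such as -LNO:cs1=32k:ls1=32:assoc1=2 (the
- values are chosen only for illustration) would correspond to 1024
- lines arranged in 512 sets of 2 lines each.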
-
- Certain arguments control the TLB. The TLB is a cache for the page
- table, and it is assumed to be fully associative. The TLB control
- arguments are as follows:
-
- ps1=n, ps2=n, ps3=n, ps4=n
- Specifies the number of bytes in a page. Specify a positive
- integer for n. The default n depends on your system
- hardware.
-
- tlb1=n, tlb2=n, tlb3=n, tlb4=n
- Specifies the number of entries in the TLB for this cache
- level. Specify a positive integer for n. The default n
- depends on your system hardware.
-
- tlbcmp1=n, tlbcmp2=n, tlbcmp3=n, tlbcmp4=n
- tlbdmp1=n, tlbdmp2=n, tlbdmp3=n, tlbdmp4=n
- Specifies the number of processor cycles it takes to service
- a clean TLB miss (the tlbcmpx= options) or dirty TLB miss
- (the tlbdmpx= options). Specify a positive integer for n.
- The default n depends on your system hardware.
-
- The following arguments control the prefetch operation:
-
- pf1[ = ( ON|OFF )]
- pf2[ = ( ON|OFF )]
- pf3[ = ( ON|OFF )]
- pf4[ = ( ON|OFF )]
- Selectively disables and enables prefetching for cache level
- x, for pfx[ = ( ON|OFF )].
-
- When -r10000 or -r12000 is in effect, pf1=ON and pf2=ON by
- default. At any other -rn setting, OFF is in effect for all
- cache levels.
-
- prefetch=n
- Specifies levels of prefetching. n can be one of the
- following:
-
- 0 Disables all prefetching. This is the default when
- -r4000, -r5000, or -r8000 is in effect.
-
- 1 Enables conservative prefetching. This is the default
- when -r10000 or -r12000 is in effect.
-
- 2 Enables aggressive prefetching.
-
- prefetch_ahead=n
- Prefetches the specified number of cache lines ahead of the
- reference. Specify a positive integer for n. The default
- is 2.
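-
- "Prefetching n cache lines ahead of the reference" can be pictured
- with the hand-written loop below. It uses __builtin_prefetch, which is
- a GCC/Clang builtin and is not documented here as a MIPSpro feature;
- the line size and function name are likewise assumptions made for
- illustration.
-
```c
#include <stddef.h>

#define LINE_DOUBLES 16  /* assumed: 128-byte line holding 8-byte doubles */
#define AHEAD 2          /* mirrors the documented default prefetch_ahead=2 */

double sum_with_prefetch(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Once per cache line, request the line AHEAD lines further on. */
        if (i % LINE_DOUBLES == 0 && i + AHEAD * LINE_DOUBLES < n)
            __builtin_prefetch(&a[i + AHEAD * LINE_DOUBLES], 0, 3);
        s += a[i];
    }
    return s;
}
```
-
- A prefetch is only a hint, so the result of the loop is the same with
- or without it; only the timing can change.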
-
- prefetch_manual[ = ( ON|OFF )]
- Specifies whether manual prefetches (through directives)
- should be respected or ignored.
-
- prefetch_manual=OFF ignores manual prefetches. This is the
- default when -r8000, -r5000, or -r4000 is in effect.
-
- prefetch_manual=ON respects manual prefetches. This is the
- default when -r10000 or -r12000 is in effect.
-
- F77 LNO Directives
- Directives within a program unit apply only to that program unit,
- reverting to the default values at the end of the program unit.
- Directives that occur outside of a program unit alter the default
- value, and therefore apply to the rest of the file from that point on,
- until overridden by a subsequent directive.
-
- Directives within a file override the command line options by default.
- To have the command line options override directives, use the command
- line option:
-
- -LNO:ignore_pragmas
-
- Fission and Fusion Directives
- * C*$* AGGRESSIVE INNER LOOP FISSION: Fission this loop in the
- inner_fission phase into as many loops as possible. This must be
- followed by an inner loop and has no effect if that loop is no
- longer an inner loop after the SNL phase.
-
- * C*$* FISSION [(n)] or C*$* FISSIONABLE: Fission the enclosing n
- levels of loops after this directive. Perform a legality test unless
- a FISSIONABLE directive is also specified. Does not re-order
- statements.
-
- * C*$* FUSE [(n [,level] )] or C*$* FUSABLE: Fuse the following n
- immediately adjacent loops. Fusion is attempted on each pair of
- adjacent loops and the level, by default, is determined by the
- maximal SNL levels of the fused loops, although partial fusion is
- allowed. Iterations may be peeled as needed during fusion; the
- peeling limit is 5 or the number specified by the
- -LNO:fusion_peeling_limit flag. When the FUSABLE directive is
- present, no legality test is done and the fusion is done up to the
- maximal SNL levels where the iteration numbers matched for each pair
- of loops to be matched. The default value for n is 2.
-
- * C*$* NO FISSION: The loop following this directive should not be
- fissioned in either the fiz_fuse phase or the inner_fission phase.
- Its inner loops, however, are allowed to be fissioned.
-
- * C*$* NO FUSION: The loop following this directive should not be
- fused with other loops.
-
- SNL Transformation Directives
- The parallelizing preprocessor may perform some transformations for
- parallelism that violate some of these directives.
-
- * C*$* INTERCHANGE (I, J [,K ...] ): Loops I, J and K (in any order)
- must directly follow this directive and be perfectly nested inside
- each other. If they are not perfectly nested, the compiler may
- perform loop distribution to make them so, or may ignore the
- annotation, or may apply imperfect interchange (this is not likely).
- The compiler attempts to reorder loops so that I is outermost, then
- J, then K. The compiler may ignore this directive. There must be a
- minimum of 2 indexes in the directive.
-
- * C*$* NO INTERCHANGE: Prevents the compiler from involving the loop
- directly following this directive in a permutation, or any loop
- nested within this loop.
-
- * C*$* BLOCKING SIZE (n1,n2) or C*$* BLOCKING SIZE (n1) or C*$*
- BLOCKING SIZE (,n2): If the specified loop is involved in a blocking
- for the primary or secondary cache, it will have a blocksize of n1
- or n2. The compiler will try to include this loop within such a
- block. If a blocking size is specified as 0, the loop is not
- actually stripped, but the entire loop is inside the block.
-
- * C*$* NO BLOCKING: Prevents the compiler from involving this loop in
- a cache blocking.
-
- * C*$* UNROLL (n [,n2] ): This directive suggests that n-1 copies of
- the loop body be added to the inner loop. If the loop that this
- directive directly precedes is an inner loop, then it indicates
- standard unrolling. If the loop that this directive directly
- precedes is not innermost, then outer loop unrolling is performed.
- n must be at least 1. If n=1, then no unrolling will be performed.
- If n=0, then the default unrolling should be applied. n2 is
- ignored.
-
- * C*$* BLOCKABLE (I,J [,K ...] ): The I, J and K loops must be
- adjacent and nested within each other, although not necessarily
- perfectly nested. This directive informs the compiler that these
- loops may legally be involved in a blocking with each other, even if
- the compiler would consider such a transformation illegal. The
- loops are also interchangeable and unrollable. This directive does
- not instruct the compiler which of these transformations to apply.
- You must specify at least 2 loop indexes in the directive.
-
- Prefetch Directives
- * C*$* PREFETCH (n[,n]): Specify prefetching for each level of the
- cache. The scope is the entire function containing the directive. n
- can be one of the following values:
-
- 0 prefetching off (default for all processors except R10000)
-
- 1 prefetching on, but conservative
-
- 2 prefetching on, and aggressive (default when prefetch is on)
-
- * C*$* PREFETCH_MANUAL (n): Specify whether manual prefetches (through
- directives) should be respected or ignored. Scope: Entire function
- containing the directive. n can be one of the following values:
-
- 0 ignore manual prefetches (default for mips3 and earlier)
-
- 1 respect manual prefetches (default for mips4)
-
- * C*$* PREFETCH_REF_DISABLE=A [, size=num]: This directive explicitly
- disables prefetching of all references to array A in the current
- function. The auto-prefetcher runs (if enabled), ignoring array A.
- The size is used for volume analysis. Scope: Entire function
- containing the directive. size=num is the size of the array
- references in this loop, in Kbytes. This is an optional argument
- and must be a constant.
-
- * C*$* PREFETCH_REF=array-ref,[stride=[str] [,str]], [level=[lev]
- [,lev]], [kind=[rd/wr]], [size=[sz]]: This directive generates a
- single prefetch instruction to the specified memory location. It
- searches for array references that match the supplied reference in
- the current loop-nest. If such a reference is found, that reference
- is connected to this prefetch node with the specified latency. If no
- such reference is found, this prefetch node stays free-floating and
- is scheduled "loosely".
-
- All references to this array in this loop-nest are ignored by the
- automatic prefetcher (if enabled).
-
- If the size is supplied, then the auto-prefetcher (if enabled)
- reduces the effective cache size by that amount in its calculations.
-
- The compiler tries to issue one prefetch per stride iteration, but
- cannot guarantee it. Redundant prefetches are preferred to
- transformations (such as inserting conditionals) which incur other
- overhead.
-
- Scope: No scope. Just generates a prefetch instruction.
-
- The following arguments are used with this directive:
-
- array-ref Required. The reference itself, for example, A(i, j).
-
- str Optional. Prefetch every str iterations of this loop. The
- default is 1.
-
- lev Optional. The level in memory hierarchy to prefetch. The
- default is 2. If lev=1, prefetch from L2 to L1 cache. If
- lev=2, prefetch from memory to L1 cache.
-
- rd/wr Optional. The default is read/write.
-
- sz Optional. The size (in Kbytes) of the array referenced in
- this loop. This must be a constant.
-
- Dependence Analysis Directives
- * CDIR$ IVDEP: This applies only to inner loops. Liberalize dependence
- analysis. Given two memory references, where at least one is loop
- variant, ignore any loop-carried dependences between the two
- references. The following are examples of this directive.
-
- do i = 1,n
- b(k) = b(k) + a(i)
- enddo
-
- IVDEP does not break the dependence because b(k) is not loop-variant.
-
- do i=1,n
- a(i) = a(i-1) + 3
- enddo
-
- IVDEP does break the dependence, but the compiler warns the user that
- it is breaking an obvious dependence.
-
- do i=1,n
- a(b(i)) = a(b(i)) + 3.
- enddo
-
- IVDEP does break the dependence.
-
- do i = 1,n
- a(i) = b(i)
- c(i) = a(i) + 3.
- enddo
-
- IVDEP does not break the dependence on a(i) because it is within an
- iteration.
-
- If -OPT:cray_ivdep=ON, Cray semantics are used and all lexically
- backwards dependences are broken. The following are examples:
-
- do i=1,n
- a(i) = a(i-1) + 3.
- enddo
-
- IVDEP does break the dependence, but the compiler warns the user that
- it is breaking an obvious dependence.
-
- do i=1,n
- a(i) = a(i+1) + 3.
- enddo
-
- IVDEP does not break the dependence because the dependence is from the
- load to the store, and the load comes lexically before the store.
-
- If -OPT:liberal_ivdep=ON, all dependences are broken.
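-
- To see why the compiler warns about "an obvious dependence" in the
- a(i) = a(i-1) + 3 example, the C sketch below compares the serial loop
- with one possible outcome when the loop-carried dependence is ignored,
- namely that every load sees the original array contents. The function
- names and the scratch array are hypothetical, and a real compiler may
- reorder the loop in other ways.
-
```c
#include <stddef.h>
#include <string.h>

/* Serial semantics: each iteration sees the value the previous one stored. */
void recurrence_serial(double *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + 3.0;
}

/* Dependence ignored: all loads come from a snapshot taken before any
   store. old is caller-provided scratch space of n entries. */
void recurrence_ignoring_dep(double *a, size_t n, double *old)
{
    memcpy(old, a, n * sizeof *a);
    for (size_t i = 1; i < n; i++)
        a[i] = old[i - 1] + 3.0;
}
```
-
- Starting from an array of zeros, the serial loop produces 0, 3, 6, 9,
- while the second version produces 0, 3, 3, 3. IVDEP is therefore safe
- only when the programmer knows the dependence does not really occur.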
-
- SEE ALSO
- cc(1), CC(1), cord(1), dso(1), f77(1), f90(1), fpmode(1), hinv(1),
- ld(1), make(1), pixie(1), pmake(1), prof(1), rld(1), smake(1).
-
- math(3M).
-
- auto_p(5), gp_overflow(5), ipa(5), opt(5), pe_environ(5).
-
- MIPSpro C and C++ Pragmas, publication 007-3587-001
-
- C Language Reference Manual, publication 007-0701-120
-
- Compiler Information File (CIF) Reference Manual
-
- MIPSpro Fortran 77 Programmer's Guide
-
- MIPSpro 7 Fortran 90 Commands and Directives Reference Manual
-
- MIPSpro 64-Bit Porting and Transition Guide
-
- This man page is available only online.
-